Relevance Vector Machines for classifying points and regions in biological sequences
Abstract
The Relevance Vector Machine (RVM) is a recently developed machine learning framework capable of building simple models from large sets of candidate features. Here, we describe a protocol for using the RVM to explore very large numbers of candidate features, and a family of models which apply the power of the RVM to classifying and detecting interesting points and regions in biological sequence data. The models described here have been used successfully for predicting transcription start sites and other features in genome sequences.
Similar resources
Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM
Structural repetitive subsequences are the most important portion of biological sequences and play crucial roles in the corresponding sequence’s fold and functionality. The largest class of repetitive subsequences is “Transposable Elements”, which has its own sub-classes depending on the contexts’ structures. Much research has been performed to critically determine the structure and function of repetitiv...
Classifying RNA Secondary Structures Using Support Vector
CLASSIFYING RNA SECONDARY STRUCTURES USING SUPPORT VECTOR MACHINES by PrathyUsha Sunkara. In contrast to DNA, RNA prevails as a single strand. As a consequence of small self-complementary regions, RNA commonly exhibits an intricate secondary structure, consisting of relatively short, double-helical segments alternated with single-stranded regions. The amount of sequence data available is rising r...
Detection of Cardiac Hypertrophy by RVM and SVM Algorithms
The word hypertrophy means an increase in size. Cardiac hypertrophy is an increase in the thickness of the heart muscle, of which left ventricular hypertrophy is the most common form. The causes of cardiac hypertrophy are high blood pressure, aortic valve stenosis, and sporting activity, respectively. Assessing it through ECG signal analysis is essential because the ris...
Evaluation of Techniques for Classifying Biological Sequences
In recent years we have witnessed an exponential increase in the amount of biological information, either DNA or protein sequences, that has become available in public databases. This has been followed by an increased interest in developing computational techniques to automatically classify these large volumes of sequence data into various categories corresponding to either their role in the ch...
Engineering support vector machine kernels that recognize translation initiation sites
Motivation: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called translation initiation sites (TIS). Results: The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines for this task, and show how to i...